System and method for a direct computation of cubic spline interpolation for real-time image codec

ABSTRACT

A cubic spline interpolation (CSI) method and apparatus for image compression by direct computation. Such a CSI method is used along with the JPEG standard to obtain a new CSI-JPEG encoder-decoder (Codec). In one embodiment, the present invention is a method and apparatus for a new CSI-JPEG codec enabling a pipelined compression method that is naturally suitable for hardware or firmware implementations.

CROSS REFERENCE TO RELATED APPLICATIONS

This patent application claims the benefit of the filing date of U.S. Provisional Patent Application Ser. No. 60/478,381, filed Jun. 12, 2003, and entitled “System and Method for A Direct Computation of Cubic Spline Interpolation for Real-Time Image Codec,” the entire contents of which are hereby expressly incorporated by reference.

FIELD OF THE INVENTION

This invention relates to data compression. More specifically, the invention relates to a new cubic-spline interpolation (CSI) for both 1-D and 2-D signals that subsamples signal and image data for compression.

BACKGROUND OF THE INVENTION

In T. K. Truong, L. J. Wang, I. S. Reed, and W. S. Hsieh, “Image data compression using cubic convolution spline interpolation,” IEEE Trans. on Image Processing, vol. 9, no. 11, pp. 1988–1995, November 2000 [1]; and L. J. Wang, W. S. Hsieh, T. K. Truong, I. S. Reed, and T. C. Cheng, “A fast efficient computation of cubic-spline interpolation in image codec,” IEEE Trans. on Signal Processing, vol. 49, no. 6, pp. 1189–1197, June 2001 [2], the entire contents of which are hereby expressly incorporated by reference, a cubic spline interpolation (CSI) is developed in order to subsample image data to achieve compression. The CSI scheme is combined with the JPEG algorithm to develop a modified JPEG encoder-decoder, which obtains a higher compression ratio and a better quality of reconstructed image than the standard JPEG. In the CSI algorithm developed in [1], a fast Fourier transform (FFT) algorithm, used in the modified JPEG encoder, is applied to perform the circular convolution needed to compress and reconstruct image data.

Recently, the authors in [2] showed that if the size of the compressed image is not chosen to be a power of two, the usual 2-D FFT is not the best algorithm for obtaining the compressed image values. To overcome this problem, the authors proposed the Winograd discrete Fourier transform (WDFT) with the overlap-save method instead of the FFT to implement the CSI scheme. The disadvantage of this faster CSI algorithm is the overlap-save method with its required boundary conditions. Thus this algorithm, though faster, is not readily realized as a real-time processor.

Therefore, there is a need for a method and apparatus for a faster and more efficient computation of a CSI for image signals.

SUMMARY OF THE INVENTION

In one embodiment, the present invention is a method performed by a computer for encoding a signal including defining a 1-D cubic-spline filter by

$\begin{matrix} {{r(t)} = \left\{ \begin{matrix} {{{\left( {3/2} \right){|t|}^{3}} - {\left( {5/2} \right){|t|}^{2}} + 1},} & {0 \leq {|t|} < 1} \\ {{{{- \left( {1/2} \right)}{|t|}^{3}} + {\left( {5/2} \right){|t|}^{2}} - {4{|t|}} + 2},} & {{1 \leq {|t|} < 2};} \\ {0,} & {{2 \leq {|t|}};} \end{matrix} \right.} & (1) \end{matrix}$

-   -   applying the filter to an input signal x(t) with

$\begin{matrix} \begin{matrix} {{y_{j} = {\sum\limits_{t = {{{- 2}\;\tau} + 1}}^{{2\;\tau} + 1}{{r(t)}\;{x\left( {t + {j\;\tau}} \right)}}}},} & \; & {{0 \leq j \leq {n - 1}},} \end{matrix} & (3) \end{matrix}$

-   -    to compute y_(j);
-   -   computing B=[b₀,b₁, . . . ,b_(n−1)]_(C), where B denotes a cyclic matrix of size n×n, where

$\begin{matrix} \begin{matrix} {b_{k} = {\sum\limits_{t = {{{- 2}\;\tau} + 1}}^{{2\;\tau} + 1}{{r(t)}\;{{r\left( {t + {k\;\tau}} \right)}.}}}} & \; & \; & {0 \leq k \leq {n - 1}} \end{matrix} & (4) \end{matrix}$

-   -   and where b₀=α=1.641, b₁=b_(n−1)=β=0.246, b₂=b_(n−2)=γ=−0.07, b₃=b_(n−3)=δ=0.004, b₄=0, b₅=0, . . . , b_(n−4)=0;
-   -   computing A=B⁻¹, where A is a circular matrix of size n×n, where A=[a₀,a₁,a₂,a₃,a₄,a₅,a₆, . . . ,a_(n−6),a_(n−5),a_(n−4),a_(n−3),a_(n−2),a_(n−1)]_(C)  (6)
-   -    and where a₀=0.646, a₁=a_(n−1)=−0.109, a₂=a_(n−2)=0.0467, a₃=a_(n−3)=−0.014, a₄=a_(n−4)=0.0046, a₅=a_(n−5)=−0.00148, and a₆≅a₇≅a₈≅ . . . ≅a_(n−6)≅0; and
-   -   computing X=B⁻¹Y=AY  (7)
-   -    by computing

$\begin{matrix} {x_{j} = {\sum\limits_{k = 0}^{n - 1}{y_{k}\mspace{11mu}{a_{{({j - k})}_{n}}.}}}} & (8) \end{matrix}$

BRIEF DESCRIPTION OF THE DRAWINGS

The features of this invention will become more apparent from a consideration of the following detailed description and the drawings, in which:

FIG. 1 is an exemplary one-dimensional (1-D) cubic convolution interpolation function;

FIG. 2 is an exemplary 1-D compression filter, according to one embodiment of the present invention;

FIG. 3 is an exemplary JPEG encoder/decoder, according to one embodiment of the present invention; and

FIGS. 4(a)–4(d) are exemplary reconstructed images.

DETAILED DESCRIPTION

A cubic spline interpolation (CSI) for 2-D signals is performed by a direct computation in order to encode and decode compression for image coding. A pipeline structure in an electronic system or an integrated chip (IC) can be used to implement this new CSI. Such a new CSI method can be used along with the JPEG standard to obtain a new CSI-JPEG encoder-decoder (Codec) while still maintaining a good quality of the reconstructed image using higher compression ratios. In one embodiment, the present invention is a method and apparatus for a new CSI-JPEG codec enabling a pipelined compression method that is naturally suitable for hardware or firmware implementations.

In one aspect, the present invention is a method performed by a computer for a direct computation of a cubic spline interpolation (CSI) for image signals. In another aspect, the present invention is an integrated chip (IC) that is configured to perform the above method. In yet another aspect, the present invention is a digital signal processor (DSP) that is configured to perform the above method.

Since a large number of zeros exists in the interval occupied by the filter coefficients of the CSI scheme (see [1]), a direct computation, instead of the FFT or WDFT algorithm, is developed here for computing the required circular convolution for any size of image. This new algorithm is used together with the JPEG standard to obtain a new JPEG codec. The advantage of this new CSI procedure over the other CSI methods (see [1] and [2]) is that it can be implemented by a pipeline structure and is naturally suitable for very large scale integration (VLSI) implementation. A comparison of the operation counts of this new algorithm, the conventional CSI algorithm, and the WDFT CSI algorithm shows that the WDFT CSI algorithm requires fewer multiplications than both the new CSI and the conventional CSI algorithms. However, the new CSI algorithm requires fewer additions than both the WDFT CSI and the conventional CSI algorithms. Finally, computer runs show that for some images of size 640×480, for example, the new CSI-JPEG encoder implemented by a direct computation requires only 1.28 sec, compared with 1.52 sec for the typical CSI-JPEG encoder of [1] and 1.11 sec for the WDFT CSI-JPEG encoder of [2], with almost the same PSNR for the reconstructed image. That is, the new CSI-JPEG encoder requires 0.24 sec less time than the typical CSI-JPEG encoder and 0.17 sec more time than the WDFT CSI-JPEG encoder. In one embodiment, a pipeline structure can be developed to implement the new CSI-JPEG encoder. As a result, the new CSI-JPEG encoder described here is easier to implement in hardware or firmware than the previous compression algorithms.

Direct Computation of the CSI Encoder for 2-D Image Signal

It was shown in [1] that the idea of the CSI scheme is to recalculate the sampled values of the image data by means of the least-squares method that uses the cubic convolution interpolation (CCI) formula. In this section it is shown that a direct computation can be utilized for 2-D image data.

To illustrate the encoding algorithm of the CSI method developed in [1] and [2], let τ be a fixed positive integer. Also, let x(t) be periodic with period N=nτ, where n is a positive integer. From S. Hou and H. C. Andrews, “Cubic splines for images interpolation and digital filtering,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 1153–1160, December 1981 [3], the entire content of which is hereby expressly incorporated by reference, the 1-D CCI function r(t) shown in FIG. 1, is defined by

$\begin{matrix} {{r(t)} = {\begin{Bmatrix} {{{\left( {3/2} \right){|t|}^{3}} - {\left( {5/2} \right){|t|}^{2}} + 1},} & {0 \leq {|t|} < 1} \\ {{{{- \left( {1/2} \right)}{|t|}^{3}} + {\left( {5/2} \right){|t|}^{2}} - {4{|t|}} + 2},} & {1 \leq {|t|} < 2} \\ {0,} & {2 \leq {|t|}} \end{Bmatrix}.}} & (1) \end{matrix}$
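As an illustration only, the CCI function of equation (1) can be sketched in a few lines of Python. The function name `cci_kernel` is hypothetical, and the sketch assumes the kernel is even, that is, evaluated on the absolute value of its argument.

```python
def cci_kernel(t: float) -> float:
    """1-D cubic convolution interpolation (CCI) function r(t) of equation (1)."""
    u = abs(t)  # assumption: the kernel is symmetric about t = 0
    if u < 1.0:
        return 1.5 * u**3 - 2.5 * u**2 + 1.0
    if u < 2.0:
        return -0.5 * u**3 + 2.5 * u**2 - 4.0 * u + 2.0
    return 0.0  # the kernel vanishes for |t| >= 2


if __name__ == "__main__":
    # Sample the kernel at half-integer steps across its support.
    for t in (-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0):
        print(t, cci_kernel(t))
```

The kernel equals 1 at the origin and 0 at every other integer, which is the property that lets it interpolate through the sampling points.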

One defines the k-th shift function of the CCI function r(t) as ψ_(k)(t)=r(t−kτ) for 0≦k≦n−1, where r(t) is assumed to be a periodic function of period N. Also, let x_(i) for 0≦i≦n−1 be the compressed values (coefficients) at the sampling points which represent the compressed data to be transmitted or stored. It follows from [1] that the set of equations as the circular convolution is given by

$\begin{matrix} \begin{matrix} {{y_{j} = {\sum\limits_{k = 0}^{n - 1}{x_{k}b_{{({j - k})}_{n}}}}},} & \; & {0 \leq j \leq {n - 1}} \end{matrix} & (2) \end{matrix}$ where (j−k)_(n) denotes the residue (j−k) mod n,

$\begin{matrix} \begin{matrix} {y_{j} = {\sum\limits_{t = {{{- 2}\;\tau} + 1}}^{{2\;\tau} + 1}{{r(t)}\;{x\left( {t + {j\;\tau}} \right)}}}} & \; & {{0 \leq j \leq {n - 1}},} \end{matrix} & (3) \\ {and} & \; \\ \begin{matrix} {b_{k} = {\sum\limits_{t = {{{- 2}\;\tau} + 1}}^{{2\;\tau} + 1}{{r(t)}\;{{r\left( {t + {k\;\tau}} \right)}.}}}} & \; & {0 \leq k \leq {n - 1}} \end{matrix} & (4) \end{matrix}$

y_(j) in (2) is the n-point circular convolution of the compressed data x_(k) with the coefficients b_(k) obtained by equation (4) for 0≦k≦n−1. It was shown in [1] that using the FCSI-JPEG, the subjective quality of the reconstructed image for τ=2 is better than that of τ=3. Thus, the special case τ=2 is considered in this application. It is not difficult to show that the number of real multiplications and real additions that are needed to compute equation (3) is 9×n and 8×n, respectively. Because of the periodicity of the CCI function r(t), from equation (4), one obtains the coefficients b₀=α, b₁=b_(n−1)=β, b₂=b_(n−2)=γ, b₃=b_(n−3)=δ, b₄=0, b₅=0, . . . , b_(n−4)=0, where α, β, γ and δ are obtained by the use of equation (8) in [1], and are shown below. These coefficients are the autocorrelation coefficients between two CSI functions. Now let [ ]^(T) denote the transpose of the column matrix X. Thus y_(j) in equation (2) can be expressed in matrix form as follows: Y=BX  (5) where Y=[y₀, y₁, . . . , y_(n−1)]^(T), X=[x₀, x₁, . . . , x_(n−1)]^(T) and B=[b₀,b₁, . . . ,b_(n−1)]_(C) denotes the cyclic matrix of size n×n. From [1], one obtains the coefficients α=1.641, β=0.246, γ=−0.07 and δ=0.004. It follows from [1] that the FFT can be used to solve (2) or (5) for x_(k), where 0≦k≦n−1.
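The stated values of α, β, γ and δ can be checked numerically. The following Python sketch evaluates equation (4) for τ=2 under one labeled assumption: the kernel is sampled at t/τ, so that its support spans |t|<2τ. This scaling is not spelled out in the text, but it reproduces the quoted coefficients. The names `cci_kernel` and `autocorrelation_coeffs` are hypothetical, and periodic wrap-around is ignored, which is harmless when n is at least 7.

```python
def cci_kernel(t: float) -> float:
    """CCI function r(t) of equation (1), assumed even in t."""
    u = abs(t)
    if u < 1.0:
        return 1.5 * u**3 - 2.5 * u**2 + 1.0
    if u < 2.0:
        return -0.5 * u**3 + 2.5 * u**2 - 4.0 * u + 2.0
    return 0.0


def autocorrelation_coeffs(tau: int = 2, count: int = 5) -> list:
    """b_k of equation (4) for k = 0 .. count-1.

    Assumption: the kernel is sampled at t / tau. Terms at t = 2*tau and
    beyond are skipped because the kernel is zero there.
    """
    taps = range(-2 * tau + 1, 2 * tau)
    return [
        sum(cci_kernel(t / tau) * cci_kernel((t + k * tau) / tau) for t in taps)
        for k in range(count)
    ]


if __name__ == "__main__":
    # For tau = 2 the first four values round to 1.641, 0.246, -0.07, 0.004.
    print(autocorrelation_coeffs(tau=2))
```

Only b₀ through b₃ are non-zero, which is the sparsity that the direct computation exploits.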

Because of the large number of zeros in the coefficients b_(k) for 0≦k≦n−1, the FFT algorithm of [1] is computationally very inefficient for performing the circular convolution given in equation (2). In order to speed up the CSI algorithm for any size of the compressed image, a direct computation, instead of the FFT of [1] or the WDFT of [2], is developed to perform the n-point circular convolution given in equation (2) for solving x_(k) for 0≦k≦n−1. In other words, one can compute the n-point circular convolution of any size of sampling points in equation (2) by means of a direct computation due to the large number of zeros of the coefficients b_(k), where 0≦k≦n−1. To illustrate this, one first finds the inverse matrix of B given in equation (5). It is well known from matrix theory that if B in equation (5) is a circular matrix, then the inverse matrix, namely A=B⁻¹, is also a circular matrix of size n×n. That is, A=[a₀,a₁,a₂,a₃,a₄,a₅,a₆, . . . ,a_(n−6),a_(n−5), a_(n−4),a_(n−3),a_(n−2),a_(n−1)]_(C)  (6) where a₀=0.646, a₁=a_(n−1)=−0.109, a₂=a_(n−2)=0.0467, a₃=a_(n−3)=−0.014, a₄=a_(n−4)=0.0046, a₅=a_(n−5)=−0.00148, and a₆≅a₇≅a₈≅ . . . ≅a_(n−6)≅0, which can be pre-computed. Note that the values of constants a_(i) for 6≦i≦n−6 are within 10⁻⁶. It can be shown by computer simulation that these constants can be assumed to be zero without degrading the quality of the reconstructed image. The solution to Y=BX given in equation (5) will then be expressed in matrix form as X=B⁻¹Y=AY  (7) where A=B⁻¹ is the circular matrix given in equation (6). Equation (7) is thus reduced to the form

$\begin{matrix} {x_{j} = {\sum\limits_{k = 0}^{n - 1}{y_{k}\mspace{11mu} a_{{({j - k})}_{n}}}}} & (8) \end{matrix}$
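One way to pre-compute the coefficients of A is sketched below in Python. It relies on a standard property of circular matrices rather than on anything specific to this disclosure: the eigenvalues of B are the discrete Fourier transform of its first column, so the first column of A=B⁻¹ is the inverse transform of their reciprocals. The helper names and the choice n=32 are illustrative assumptions, and a plain O(n²) transform is used to keep the sketch dependency-free.

```python
import cmath


def build_b_column(n: int) -> list:
    """First column of the circular matrix B of equation (5), n >= 7."""
    alpha, beta, gamma, delta = 1.641, 0.246, -0.07, 0.004
    b = [0.0] * n
    b[0] = alpha
    b[1] = b[n - 1] = beta
    b[2] = b[n - 2] = gamma
    b[3] = b[n - 3] = delta
    return b


def circulant_inverse_column(column: list) -> list:
    """First column of the inverse of a circular matrix, via a naive DFT."""
    n = len(column)
    w = -2j * cmath.pi / n
    # Eigenvalues of the circular matrix are the DFT of its first column.
    eig = [sum(column[k] * cmath.exp(w * m * k) for k in range(n)) for m in range(n)]
    # Inverse DFT of the reciprocal eigenvalues gives the inverse's first column.
    return [
        (sum(cmath.exp(-w * m * k) / eig[m] for m in range(n)) / n).real
        for k in range(n)
    ]


if __name__ == "__main__":
    a = circulant_inverse_column(build_b_column(32))
    print([round(v, 5) for v in a[:7]])
```

This step is done once, offline; the encoder itself only needs the handful of resulting coefficients.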

For τ=2, since the coefficients a_(i) for 0≦i≦n−1 given in equation (6) are periodic with period n, one can rearrange these coefficients as a_(−5)=a₅=−0.00148, a_(−4)=a₄=0.0046, a_(−3)=a₃=−0.014, a_(−2)=a₂=0.0467, a_(−1)=a₁=−0.109, and a₀=0.646. These coefficients, shown in FIG. 2, are called the reconstructed filter coefficients. Because of the periodicity of the known data function y_(k)=y_(k+n), the compressed data x_(j) in equation (8) can be obtained by using the following linear correlation equation:

$\begin{matrix} \begin{matrix} {x_{j} = {\sum\limits_{k = {- 5}}^{5}{a_{k}\; y_{j + k}}}} & \; & {0 \leq j \leq {n - 1}} \end{matrix} & (9) \end{matrix}$ where the boundary conditions are y_(−i)=y_(n−i) for 1≦i≦5 and y_(i)=y_(i−n) for n≦i≦n+4. In equation (9), x_(j) can be obtained by correlating the known y_(j) in equation (3) for −5≦j≦n+4 with the reconstructed filter coefficients a_(k) for −5≦k≦5. Computing x_(j) in equation (9) involves n correlations of only 11 points each for τ=2. Hence, the numbers of real multiplications and additions of this direct filter computation needed to implement x_(j) in equation (9) are 11×n and 10×n, respectively. It is easy to see that a pipeline architecture can be developed to directly compute a linear correlation. Its modularity makes it well suited to hardware or firmware implementation.
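The 11-point correlation of equation (9) can be sketched as follows, with the periodic boundary conditions handled by modular indexing. The helper names are hypothetical, and the forward model `forward_b` is included only so that the round trip can be checked; because the a coefficients are truncated and rounded, the recovery is approximate rather than exact.

```python
# Rounded coefficients quoted in the text for tau = 2 (symmetric in k).
B_TAPS = {0: 1.641, 1: 0.246, 2: -0.07, 3: 0.004}
A_TAPS = {0: 0.646, 1: -0.109, 2: 0.0467, 3: -0.014, 4: 0.0046, 5: -0.00148}


def symmetric_circular_filter(data: list, taps: dict) -> list:
    """Correlate periodic data with a symmetric filter given by taps[|k|]."""
    n = len(data)
    reach = max(taps)
    return [
        sum(taps[abs(k)] * data[(j + k) % n] for k in range(-reach, reach + 1))
        for j in range(n)
    ]


def forward_b(x: list) -> list:
    """y = B x, the circular convolution of equation (2); needs n >= 7."""
    return symmetric_circular_filter(x, B_TAPS)


def direct_solve(y: list) -> list:
    """x of equation (9): 11 multiplications and 10 additions per output."""
    return symmetric_circular_filter(y, A_TAPS)


if __name__ == "__main__":
    x = [((i * 37) % 11) / 10 for i in range(16)]
    x_rec = direct_solve(forward_b(x))
    print(max(abs(u - v) for u, v in zip(x, x_rec)))
```

Each output depends only on a fixed window of 11 inputs, so the loop body maps naturally onto a pipelined multiply-accumulate stage.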

For the CSI algorithm implemented by the FFT algorithm, if n is a power of two, then n₁=n. Otherwise, the data y_(j) is expanded from n pixels to n₁=2^(l) pixels by appending zeros to the edge of this data, where l is the smallest integer such that 2^(l)>n. For a given n, the numbers of real multiplications and additions needed to implement x_(j) in equation (9) are 4(2n₁ log n₁+n₁) and 2(2n₁ log n₁+n₁), respectively.

For the CSI algorithm implemented by the 9-point WDFT algorithm, if n is divided by 7, then n=q·7+r, where q and r are the quotient and remainder of n, respectively. The number of the coefficients y_(j), namely n is divided into q+a overlapping 9-point sub-functions, where a=1 if r−2≧1 and a=0, if r−2≦0. Each 9-point sub-function is transformed by the direct use of the 9-point WDFT. It follows from D. P. Kolba and T. W. Parks, “A prime factor FFT algorithm using high-speed convolution,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-25, no. 4, pp. 281–294, August 1977 [4], the entire content of which is hereby expressly incorporated by reference, that the computation of the 9-point WDFT requires 8 real multiplies, 49 real adders, and 2 shifts. Thus, the number of real multiplications and additions used to perform x_(j) in (9) is (q+a)(2×8+9) and (q+a)(2×49+9), respectively.

Since the decimation requires the computation of y_(j) in equation (3) and x_(j) in equation (9), the total numbers of real multiplications needed to compute the decimation using the FFT, the WDFT, and the direct algorithms are 9×n+4(2n log n+n), 9×n+(q+a)(2×8+9), and 9×n+11×n, respectively. Likewise, the total numbers of real additions needed to compute the decimation using the FFT, the WDFT, and the direct algorithms are 8×n+2(2n log n+n), 8×n+(q+a)(2×49+9), and 8×n+10×n, respectively. The complexities of computing the decimation at different pixels using the three algorithms given above are summarized in Table I.
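As a small worked example of the arithmetic above, the operation counts of the direct algorithm can be tabulated directly from the two formulas 9×n+11×n and 8×n+10×n. The function name is hypothetical; the values it prints match the direct-algorithm columns of Table I.

```python
def direct_decimation_cost(n: int) -> tuple:
    """Real multiplications and additions of the direct decimation for n points."""
    mults = 9 * n + 11 * n  # equation (3) plus equation (9)
    adds = 8 * n + 10 * n   # equation (3) plus equation (9)
    return mults, adds


if __name__ == "__main__":
    for n in (640, 480, 320, 240, 352, 288, 512):
        print(n, *direct_decimation_cost(n))
```

The cost grows linearly in n with no dependence on whether n is a power of two, unlike the FFT-based count.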

These three algorithms are compared in Table I, which gives the number of real multiplications and additions needed to perform the decimation. From this table, one observes that the FFT algorithm requires substantially more real multiplications than either of the other two methods, and more real additions than the direct method. The comparison of the direct method and the WDFT algorithm in Table I shows that the direct method requires about 1.59 times as many real multiplications and about 1.30 times fewer real additions than the WDFT algorithm. However, as mentioned earlier, a pipeline convolution algorithm can be developed to compute the linear correlation defined in equation (9). As a consequence, the direct method can be easier to implement in hardware or firmware than the FFT and WDFT algorithms.

Let x(t₁,t₂) be a doubly periodic image signal of periods n₁τ and n₂τ with respect to the two integer variables t₁ and t₂, where n₁ and n₂ are also integers. The 2-D cubic spline function r(t₁,t₂) is defined by r(t₁,t₂)=r(t₁)·r(t₂), where the 1-D CCI r(t_(i)) is given in (1) and is also assumed to be a periodic function of period n_(i)τ for i=1, 2. Finally, let ψ_(k) ₁ _(,k) ₂ (t₁,t₂)=r(t₁−k₁τ,t₂−k₂τ)=r(t₁−k₁τ)·r(t₂−k₂τ)=ψ_(k) ₁ (t₁)·ψ_(k) ₂ (t₂) for 0≦k_(i)≦n_(i)−1, where 1≦i≦2. Again, it is shown in [1] that the set of equations as 2-D circular convolution is

$\begin{matrix} \begin{matrix} {y_{j_{1},j_{2}} = {\sum\limits_{k_{1} = 0}^{n_{1} - 1}{\sum\limits_{k_{2} = 0}^{n_{2} - 1}{x_{k_{1},k_{2}}b_{{({j_{1} - k_{1}})}_{n_{1}},{({j_{2} - k_{2}})}_{n_{2}}}}}}} \\ {{{{{for}\mspace{14mu} 0} \leq j_{i} \leq {n_{i} - {1\mspace{14mu}{and}\mspace{14mu} i}}} = 1},2} \end{matrix} & (10) \end{matrix}$ where, for 0≦j_(i)≦n_(i)−1 and i=1, 2, x_(k₁,k₂) denotes the compressed values at the sampling points,

$\begin{matrix} {y_{j_{1},j_{2}} = {\sum\limits_{m_{1} = {{{- 2}\;\tau} + 1}}^{{2\;\tau} - 1}{\sum\limits_{m_{2} = {{{- 2}\;\tau} + 1}}^{{2\;\tau} - 1}{{x\left( {{m_{1} + {j_{1}\tau}},{m_{2} + {j_{2}\tau}}} \right)}{r\left( {m_{1},m_{2}} \right)}}}}} & (11) \end{matrix}$

for 0≦j_(i)≦n_(i)−1 and i=1,2 and

$\begin{matrix} {b_{(j_{1}-k_{1})_{n_{1}},(j_{2}-k_{2})_{n_{2}}} = \sum\limits_{m_{1}=-2\tau+1}^{2\tau-1} \sum\limits_{m_{2}=-2\tau+1}^{2\tau-1} r\left( m_{1}+(j_{1}-k_{1})\tau,\; m_{2}+(j_{2}-k_{2})\tau \right) r\left( m_{1},m_{2} \right)} & (12) \end{matrix}$

Recall that r(m₁,m₂)=r(m₁)r(m₂). Then equation (11) becomes

$\begin{matrix} {\begin{matrix} {y_{j_{1},j_{2}} = \sum\limits_{m_{2}=-2\tau+1}^{2\tau-1} \left( \sum\limits_{m_{1}=-2\tau+1}^{2\tau-1} x\left( m_{1}+j_{1}\tau,\; m_{2}+j_{2}\tau \right) r\left( m_{1} \right) \right) r\left( m_{2} \right)} \\ {= \sum\limits_{m_{2}=-2\tau+1}^{2\tau-1} y\left( j_{1}\tau,\; m_{2}+j_{2}\tau \right) r\left( m_{2} \right)} \end{matrix}} & (13) \end{matrix}$ where

$y\left( j_{1}\tau,\; m_{2}+j_{2}\tau \right) = \sum\limits_{m_{1}=-2\tau+1}^{2\tau-1} x\left( m_{1}+j_{1}\tau,\; m_{2}+j_{2}\tau \right) r\left( m_{1} \right).$

To obtain y_(j₁,j₂) in equation (13), we first convolve the 1-D CCI function r(t) with each row of the data matrix x(t₁,t₂). This resulting data matrix is then convolved by column with the same CCI function r(t).
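The row-then-column evaluation of equation (13) can be sketched as below. The sketch is illustrative only: function names are hypothetical, the image is a list of lists whose side lengths are multiples of τ, periodicity is handled by modular indexing, and the kernel is assumed to be sampled at t/τ as in the 1-D case.

```python
def cci_kernel(t: float) -> float:
    """CCI function r(t) of equation (1), assumed even in t."""
    u = abs(t)
    if u < 1.0:
        return 1.5 * u**3 - 2.5 * u**2 + 1.0
    if u < 2.0:
        return -0.5 * u**3 + 2.5 * u**2 - 4.0 * u + 2.0
    return 0.0


def decimate_1d(signal: list, tau: int = 2) -> list:
    """y_j of equation (3) for a periodic signal of length n * tau."""
    big_n = len(signal)
    taps = range(-2 * tau + 1, 2 * tau)
    return [
        sum(cci_kernel(t / tau) * signal[(t + j * tau) % big_n] for t in taps)
        for j in range(big_n // tau)
    ]


def decimate_2d(image: list, tau: int = 2) -> list:
    """Equation (13): filter along one index, then along the other."""
    along_rows = [decimate_1d(row, tau) for row in image]
    along_cols = [decimate_1d(list(col), tau) for col in zip(*along_rows)]
    return [list(row) for row in zip(*along_cols)]  # transpose back


if __name__ == "__main__":
    flat = [[1.0] * 8 for _ in range(8)]
    print(decimate_2d(flat))
```

Because the 2-D kernel is a product of two 1-D kernels, the order of the two passes does not affect the result.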

In equation (10), the 1-D convolution can be used to solve for x_(k₁,k₂). To illustrate this, again since r(t₁−k₁τ,t₂−k₂τ)=r(t₁−k₁τ)r(t₂−k₂τ) and r(m₁,m₂)=r(m₁)r(m₂), equation (12) becomes

$\begin{matrix} {\begin{matrix} {b_{(j_{1}-k_{1})_{n_{1}},(j_{2}-k_{2})_{n_{2}}} = \sum\limits_{m_{1}=-2\tau+1}^{2\tau-1} \sum\limits_{m_{2}=-2\tau+1}^{2\tau-1} r\left( m_{1}+(j_{1}-k_{1})\tau \right) r\left( m_{2}+(j_{2}-k_{2})\tau \right) r\left( m_{1} \right) r\left( m_{2} \right)} \\ {= b_{(j_{1}-k_{1})_{n_{1}}} \cdot b_{(j_{2}-k_{2})_{n_{2}}}} \end{matrix}} & (14) \end{matrix}$ where

$b_{(j_{i}-k_{i})_{n_{i}}} = \sum\limits_{m_{i}=-2\tau+1}^{2\tau-1} r\left( m_{i}+(j_{i}-k_{i})\tau \right) r\left( m_{i} \right)$ for i=1, 2. The substitution of equation (14) into equation (10) yields the following 2-D circular convolution:

$\begin{matrix} {y_{j_{1},j_{2}} = {\sum\limits_{k_{1} = 0}^{n_{1} - 1}{\sum\limits_{k_{2} = 0}^{n_{2} - 1}{x_{k_{1},k_{2}}{b_{{({j_{1} - k_{1}})}_{n_{1}}} \cdot b_{{({j_{2} - k_{2}})}_{n_{2}}}}}}}} & (15) \end{matrix}$

Equation (15) can be decomposed into two n₁- and n₂-point cyclic convolutions as follows:

$\begin{matrix} {{z_{j_{1},k_{2}} = {\sum\limits_{k_{1} = 0}^{n_{1} - 1}{x_{k_{1},k_{2}}b_{{({j_{1} - k_{1}})}_{n_{1}}}}}},} & (16) \\ {y_{j_{1},j_{2}} = {\sum\limits_{k_{2} = 0}^{n_{2} - 1}{z_{j_{1},k_{2}}b_{{({j_{2} - k_{2}})}_{n_{2}}}}}} & (17) \end{matrix}$ where the filter coefficients b₀, b₁, . . . , b_(nᵢ−1) for i=1, 2 are given in [1].

To obtain y_(j₁,j₂) in equation (17), one first convolves each row of the data matrix x_(k₁,k₂) with the coefficients b₀, b₁, . . . , b_(n₁−1), as in equation (16). The resulting data matrix is then convolved with the coefficients b₀, b₁, . . . , b_(n₂−1). In order to find x_(k₁,k₂) from the known data y_(j₁,j₂) and the coefficients b₀, b₁, . . . , b_(nᵢ−1) for i=1, 2, equations (16) and (17) can be expressed explicitly in matrix form as follows:

z_(j₁,k₂)=[z_(0,k₂), z_(1,k₂), . . . , z_(n₁−1,k₂)]^(T)=B₁[x_(0,k₂), x_(1,k₂), . . . , x_(n₁−1,k₂)]^(T) for 0≦k₂≦n₂−1  (18)

y_(j₁,j₂)=[y_(j₁,0), y_(j₁,1), . . . , y_(j₁,n₂−1)]^(T)=B₂[z_(j₁,0), z_(j₁,1), . . . , z_(j₁,n₂−1)]^(T) for 0≦j₁≦n₁−1  (19)

where the matrices B_(i)=[b₀,b₁,b₂, . . . , b_(nᵢ−1)]_(C) for i=1, 2. In equation (15), the direct computation can be used to solve for x_(k₁,k₂). To see this, multiplying both sides of equation (19) by the inverse matrix B₂⁻¹ yields

z_(j₁,k₂)=[z_(j₁,0), z_(j₁,1), . . . , z_(j₁,n₂−1)]^(T)=B₂⁻¹[y_(j₁,0), y_(j₁,1), . . . , y_(j₁,n₂−1)]^(T) for 0≦j₁≦n₁−1  (20)

Similarly, multiplying both sides of equation (18) by the inverse matrix B₁⁻¹, one obtains

x_(k₁,k₂)=[x_(0,k₂), x_(1,k₂), . . . , x_(n₁−1,k₂)]^(T)=B₁⁻¹[z_(0,k₂), z_(1,k₂), . . . , z_(n₁−1,k₂)]^(T) for 0≦k₂≦n₂−1  (21)

It follows from equations (20) and (21) that, to solve for x_(k₁,k₂) in equation (15), the reconstructed filter coefficients of FIG. 2, a₀, a₁, . . . , a_(n₂−1), are convolved with each column of the compressed matrix y_(j₁,j₂) to obtain z_(j₁,k₂); then, each row of the resulting data matrix z_(j₁,k₂) is convolved with the filter coefficients a₀, a₁, . . . , a_(n₁−1) to obtain the reconstructed data x_(k₁,k₂).
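A minimal Python sketch of this two-pass solve is given below, using the truncated 11-point filter of equation (9) in place of the full inverse matrices. The function names are hypothetical, the data are held as a list of lists indexed as y[j₁][j₂], and the result is approximate because the a coefficients are rounded and truncated.

```python
# Rounded reconstructed filter coefficients quoted in the text for tau = 2.
A_TAPS = {0: 0.646, 1: -0.109, 2: 0.0467, 3: -0.014, 4: 0.0046, 5: -0.00148}


def solve_1d(y: list) -> list:
    """Equation (9): 11-point correlation with periodic boundary conditions."""
    n = len(y)
    return [
        sum(A_TAPS[abs(k)] * y[(j + k) % n] for k in range(-5, 6))
        for j in range(n)
    ]


def solve_2d(y: list) -> list:
    """Equations (20) and (21) applied one index at a time."""
    # Equation (20): for each fixed j1, filter along the second index.
    z = [solve_1d(row) for row in y]
    # Equation (21): for each fixed k2, filter along the first index.
    x_transposed = [solve_1d(list(col)) for col in zip(*z)]
    return [list(row) for row in zip(*x_transposed)]


if __name__ == "__main__":
    y = [[float((i + j) % 4) for j in range(16)] for i in range(16)]
    x = solve_2d(y)
    print(len(x), len(x[0]))
```

Each pass reuses the same 1-D routine, so a hardware realization would need only one filter stage plus a transposition buffer.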

The new encoding method for the 2-D image data for τ=2 is summarized in the following three steps:

-   -   1. Apply equation (13) with the 1-D CCI function given in (1) to the original image to find all of the coefficients y_(j₁,j₂). In other words, convolve the 1-D CCI function with each coordinate of the data matrix x(m₁, m₂) to obtain the coefficients y_(j₁,j₂).
-   -   2. From the known b_(j), the circular matrix in equation (5) can be constructed. Then, compute the inverse matrix of B, namely A. Finally, the compressed filter coefficients a_(j) for −5≦j≦5 can be found by the use of the matrix in equation (6).
-   -   3. Apply equation (20) and equation (21) with the compressed filter coefficients a_(j) to obtain the reconstructed data x_(k₁,k₂). In other words, convolve the filter coefficients a_(j) for −5≦j≦5 with each coordinate of the matrix y_(j₁,j₂) to obtain the reconstructed image of x_(k₁,k₂).

Decoding Algorithm of the Compressed 2-D Signal

In the decoding process, using the reconstructed values as the sampling points (e.g., the x_(k₁,k₂) data), the reconstructed points between the sampling points are obtained by means of the CCI function given in equation (1). To do this, since the transformed image data x_(k₁,k₂) for 0≦k_(i)≦n_(i)−1 and i=1,2 are known, the 2-D reconstructed image s(t₁,t₂) is the 2-D convolution of the 2-D CCI function r(t₁,t₂)=r(t₁)·r(t₂) and the 2-D sampled waveform x_(k₁,k₂). Since r(t₁−k₁τ,t₂−k₂τ)=r(t₁−k₁τ)·r(t₂−k₂τ), for τ=2, equation (12) in [1] becomes

$\begin{matrix} {{s\left( {t_{1},t_{2}} \right)} = {\sum\limits_{k_{1} = 0}^{n_{1} - 1}{\sum\limits_{k_{2} = 0}^{n_{2} - 1}{x_{k_{1},k_{2}}{r\left( {t_{1} - {2k_{1}}} \right)}{r\left( {t_{2} - {2k_{2}}} \right)}}}}} & (22) \end{matrix}$

Equation (22) can be decomposed into two n₁- and n₂-point convolutions as follows:

$\begin{matrix} {s\left( t_{1},k_{2} \right) = \sum\limits_{k_{1}=0}^{n_{1}-1} x_{k_{1},k_{2}}\, r\left( t_{1}-2k_{1} \right) \mspace{14mu}\text{for}\mspace{14mu} 0 \leq k_{2} \leq n_{2}-1 \mspace{14mu}\text{and}\mspace{14mu} 0 \leq t_{1} \leq 2n_{1}} & (23) \end{matrix}$ and

$\begin{matrix} {s\left( t_{1},t_{2} \right) = \sum\limits_{k_{2}=0}^{n_{2}-1} s\left( t_{1},k_{2} \right) r\left( t_{2}-2k_{2} \right) \mspace{14mu}\text{for}\mspace{14mu} 0 \leq t_{2} \leq 2n_{2}} & (24) \end{matrix}$

Thus, from equations (23) and (24), the discrete data of each row can be interpolated from the transformed and compressed image data x_(k₁,k₂), with a similar interpolation for the given discrete data of each column. The above method is a bilinear interpolation described in W. K. Pratt, Digital Image Processing, 2^(nd) ed., New York: Wiley, 1991 [5], the relevant contents of which are hereby expressly incorporated by reference.
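The separable interpolation of equations (23) and (24) can be sketched as follows. As before, this is an illustration under stated assumptions rather than the disclosed implementation: the names are hypothetical, the kernel is sampled at t/τ, the signal is treated as periodic, and the output covers one period (t from 0 to nτ−1).

```python
def cci_kernel(t: float) -> float:
    """CCI function r(t) of equation (1), assumed even in t."""
    u = abs(t)
    if u < 1.0:
        return 1.5 * u**3 - 2.5 * u**2 + 1.0
    if u < 2.0:
        return -0.5 * u**3 + 2.5 * u**2 - 4.0 * u + 2.0
    return 0.0


def interpolate_1d(x: list, tau: int = 2) -> list:
    """s(t) = sum_k x_k r((t - k*tau)/tau) over one period; needs len(x) >= 4."""
    n = len(x)
    big_n = n * tau
    out = []
    for t in range(big_n):
        acc = 0.0
        for k in range(n):
            d = (t - k * tau) % big_n
            if d > big_n / 2:
                d -= big_n  # signed distance on the periodic grid
            acc += x[k] * cci_kernel(d / tau)
        out.append(acc)
    return out


def interpolate_2d(x: list, tau: int = 2) -> list:
    """Equation (23) along the first index, then equation (24) along the second."""
    first = [interpolate_1d(list(col), tau) for col in zip(*x)]
    s_t1_k2 = [list(row) for row in zip(*first)]
    return [interpolate_1d(row, tau) for row in s_t1_k2]


if __name__ == "__main__":
    print(interpolate_1d([1.0, 3.0, 2.0, 5.0, 4.0, 0.0]))
```

At every even index the output reproduces the corresponding sample exactly, since the kernel is 1 at the origin and 0 at the other integers.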

Fast JPEG Encoder and Decoder

According to one embodiment of the present invention, a new JPEG encoder-decoder model includes the 1/4 new CSI scheme for preprocessing and the 1/4 CSI for the post-processing needed in the standard JPEG algorithm as shown in FIG. 3. In this model, an original image in RGB color space is converted to another preliminary image in YUV color space (see, for example, [5]) prior to the 1/4 CSI process.

There are two steps in the codec. The first step is the pre-processing, which uses the 1/4 new CSI scheme for the Y, U, and V images individually. At the end of the 1/4 CSI computation, these separate Y, U, and V images are combined to yield one YUV image. The second step, the post-processing, uses the 1/4 CSI for the Y, U, and V images. Finally, the Y, U, and V images are combined to yield the YUV format. Then, this YUV image is converted to the final reconstructed RGB image.

Let x_(k₁,k₂) and s_(k₁,k₂) be the original and reconstructed images, respectively, and let k₁, k₂ for 0≦k₁≦M−1 and 0≦k₂≦N−1 be the index numbers that determine the vertical and horizontal positions of objects in the images. The mean square error (MSE) and the PSNR of the 2-D signal are defined as in [1].
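The MSE and PSNR formulas themselves are not reproduced in this text. The sketch below uses the common definitions for 8-bit images (peak value 255), which are assumed, not confirmed, to match those of [1]; the function names are hypothetical.

```python
import math


def mse(original: list, reconstructed: list) -> float:
    """Mean square error between two M x N images given as lists of rows."""
    rows, cols = len(original), len(original[0])
    total = sum(
        (original[i][j] - reconstructed[i][j]) ** 2
        for i in range(rows)
        for j in range(cols)
    )
    return total / (rows * cols)


def psnr(original: list, reconstructed: list, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB (assumed definition: 10 log10(peak^2 / MSE))."""
    err = mse(original, reconstructed)
    if err == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / err)


if __name__ == "__main__":
    a = [[0.0, 0.0], [0.0, 0.0]]
    b = [[1.0, 1.0], [1.0, 1.0]]
    print(psnr(a, b))
```

For color images, the same function would be applied separately to the Y, U, and V planes, as in Table III.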

Experimental results for the 2-D signal image are compared using the CSI scheme of [1], the fast CSI scheme given in [2], and the new CSI scheme given in this application. These results are computed and are shown in Table II. The PSNR of the 2-D signal is calculated for the standard images of size 512×512. That is, the original image is decimated by the disclosed CSI scheme to obtain data samples with a compression ratio of 4:1. In addition, the reconstructed values between the sampling points are interpolated by the 1/4 CCI in equation (1) to obtain the reconstructed image. It is seen from this table that the PSNR of the new CSI scheme is within about 0.6 dB of that of the CSI and the FCSI schemes.

Table III lists the PSNR values of the RGB color Lena reconstructed image of size 512 by 512 at different compression ratios for the CSI-JPEG of [1], the FCSI-JPEG of [2], and the new CSI-JPEG codec disclosed in this application. From this table, one observes that, for the same compression ratios, the PSNR of the image of the new CSI-JPEG codec is similar to that of the two other schemes.

FIG. 4 shows the reconstructed image of Lena at the same compression ratio of 100:1, using the conventional CSI-JPEG, FCSI-JPEG, and the new CSI-JPEG codec. The Lena image using our disclosed method indicates a subjective quality of reconstructed image similar to that of either of the other two methods. The image of FIG. 4 is of size 512 by 512 with a compression ratio of 100:1. FIG. 4(a) is an original Lena image, FIG. 4(b) is an image reconstructed by the original CSI-JPEG with the PSNR of Y image equivalent to 35.08 dB, FIG. 4(c) shows an image reconstructed by the FCSI-JPEG with PSNR of Y image equivalent to 35.07 dB, and FIG. 4(d) depicts an image reconstructed by the new CSI-JPEG with PSNR of Y image equivalent to 34.51 dB.

Finally, the new CSI-JPEG codec was implemented on an 800-MHz Intel Pentium III personal computer using a C program. The computation time of this new simplified algorithm is given in Table IV. It follows from Table IV that the new CSI-JPEG encoder requires 1.28 sec, compared with 1.11 sec for the FCSI-JPEG encoder and 1.52 sec for the CSI-JPEG encoder. Although the new CSI-JPEG decoder requires almost the same computation time as the other two decoders, the new CSI-JPEG encoder requires 0.17 sec more time than the FCSI-JPEG encoder and 0.24 sec less time than the CSI-JPEG encoder.

It will be recognized by those skilled in the art that various modifications may be made to the illustrated and other embodiments of the invention described above, without departing from the broad inventive scope thereof. It will be understood therefore that the invention is not limited to the particular embodiments or arrangements disclosed, but is rather intended to cover any changes, adaptations or modifications which are within the scope and spirit of the invention as defined by the appended claims.

TABLE I
COMPLEXITY OF COMPUTING DECIMATION USING FFT, WDFT AND DIRECT METHODS

       FFT Algorithm           WDFT Algorithm          Direct Algorithm
n      No. Real   No. Real     No. Real   No. Real     No. Real   No. Real
       Mult.      Add.         Mult.      Add.         Mult.      Add.
640    22688      13584        8110       15178        12800      11520
480    16536      9948         6020       11116        9600       8640
320    10573      6407         4130       7910         6400       5760
240    7690       4685         3060       5771         4800       4320
352    11747      7106         4468       8380         7040       6336
288    9410       5713         3592       6798         5760       5184
512    17753      10668        6458       12014        10240      9216

TABLE II
PSNR (dB) OF 2-D IMAGE OF SIZE 512 BY 512 WITH COMPRESSION RATIO 4:1 FOR τ = 2

Image      CSI [1]    FCSI [2]   NEW CSI
Peppers    33.21      33.20      32.89
Lake       30.87      30.86      30.51
Couple     30.51      30.36      30.23
Crowd      33.86      33.86      33.42
Lena       35.08      35.07      34.51

TABLE III
PSNR VALUE AT DIFFERENT COMPRESSION RATIOS OF 2-D IMAGE OF SIZE 512 BY 512 FOR CSI-JPEG, FCSI-JPEG, AND NEW CSI-JPEG CODEC

              PSNR (dB)
Compression   CSI-JPEG                FCSI-JPEG               NEW CSI-JPEG
Ratio         Y       U       V       Y       U       V       Y       U       V
125:1         30.27   34.89   35.00   30.37   34.98   35.09   29.89   33.55   34.56
100:1         31.20   35.66   35.85   31.15   35.64   35.67   30.92   35.42   35.66
75:1          32.15   36.45   36.42   32.14   36.41   36.33   32.02   36.10   36.37
50:1          33.36   37.42   37.32   33.51   37.51   37.24   33.28   37.32   37.28

TABLE IV
COMPUTATION TIME (SECONDS) OF THE COLOR IMAGE OF SIZE 640 BY 480 AT THE COMPRESSION RATIO OF 100:1 IMPLEMENTED ON AN 800 MHz INTEL PENTIUM III PERSONAL COMPUTER

                Encoder                              Decoder
                Decimation   JPEG Encoder   Total    JPEG Decoder   Interpolation   Total
CSI-JPEG        0.68         0.84           1.52     0.23           0.1             0.33
FCSI-JPEG       0.27         0.84           1.11     0.23           0.08            0.31
New CSI-JPEG    0.44         0.84           1.28     0.23           0.1             0.33

1. A method performed by a computer for encoding a one-dimensional (1-D) image signal x(t), the image signal being a periodic signal with period N=nτ, where n is an integer and τ is a fixed positive integer, the method comprising:
defining a 1-D cubic-spline filter r(t) by
$r(t)=\begin{cases}(3/2)|t|^{3}-(5/2)|t|^{2}+1, & 0\le |t|<1\\ -(1/2)|t|^{3}+(5/2)|t|^{2}-4|t|+2, & 1\le |t|<2\\ 0, & 2\le |t|;\end{cases}\qquad(1)$
applying the filter to the image signal x(t) with
$y_{j}=\sum_{t=-2\tau+1}^{2\tau+1} r(t)\,x(t+j\tau),\qquad 0\le j\le n-1,\qquad(3)$
to compute $y_j$, where $y_j$ is an n-point circular correlation of filter r(t) and image signal x(t);
computing a cyclic matrix of size n×n, $B=[b_0,b_1,\ldots,b_{n-1}]_C$, where matrix coefficients $b_k$ are the autocorrelation coefficients for 0≤k≤n−1 between two filters r(t) and r(t+kτ),
$b_{k}=\sum_{t=-2\tau+1}^{2\tau+1} r(t)\,r(t+k\tau),\qquad 0\le k\le n-1,\qquad(4)$
and where for τ=2, $b_0=\alpha=1.641$, $b_1=b_{n-1}=\beta=0.246$, $b_2=b_{n-2}=\gamma=-0.07$, $b_3=b_{n-3}=\delta=0.004$, $b_4=0$, $b_5=0$, . . . , $b_{n-4}=0$;
computing a reconstructed filter by computing the inverse matrix of B, $A=B^{-1}$, where the reconstructed filter A is a circular matrix of size n×n, where
$A=[a_0,a_1,a_2,a_3,a_4,a_5,a_6,\ldots,a_{n-6},a_{n-5},a_{n-4},a_{n-3},a_{n-2},a_{n-1}]_C\qquad(6)$
and where the coefficients of the reconstructed filter A are $a_0=0.646$, $a_1=a_{n-1}=-0.109$, $a_2=a_{n-2}=0.0467$, $a_3=a_{n-3}=-0.014$, $a_4=a_{n-4}=0.0046$, $a_5=a_{n-5}=-0.00148$, and $a_6\cong a_7\cong a_8\cong\ldots\cong a_{n-6}\cong 0$; and
computing
$X=B^{-1}Y=AY\qquad(7)$
where $Y=[y_0,y_1,\ldots,y_{n-1}]^{T}$ and $X=[x_0,x_1,\ldots,x_{n-1}]^{T}$ are column vectors, each being the transpose of the corresponding row vector, by computing a circular convolution:
$x_{j}=\sum_{k=0}^{n-1} y_{k}\,a_{(j-k)_{n}},\qquad(8)$
where $x_j$ are convolution coefficients between time samples $y_k$ and $a_k$ for 0≤k,j≤n−1 and are the encoded values of the image signal x(t) at sampling points, and where $(j-k)_n$ denotes the residue of (j−k) modulo n.
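The coefficients recited in claim 1 can be checked numerically. The sketch below evaluates the kernel of eq. (1), forms the autocorrelation coefficients of eq. (4) for τ=2, and inverts the cyclic matrix B through a small discrete Fourier transform. It assumes the kernel is sampled at t/τ inside the sums; the claim writes r(t), but the listed values of b_k are reproduced only under that sampling, so it is labeled here as an assumption. The DFT-based inversion is a convenience of this sketch, not the method recited in the claim, and all function names are hypothetical.

```python
import cmath


def r(u):
    """Cubic-spline kernel of eq. (1), evaluated at |u|."""
    u = abs(u)
    if u < 1:
        return 1.5 * u ** 3 - 2.5 * u ** 2 + 1.0
    if u < 2:
        return -0.5 * u ** 3 + 2.5 * u ** 2 - 4.0 * u + 2.0
    return 0.0


def autocorr_coeffs(n, tau):
    """First row [b_0, ..., b_{n-1}] of the cyclic matrix B, eq. (4).

    ASSUMPTION: the kernel is sampled at t/tau. Only lags |k| <= 3 can
    overlap, and negative lags wrap around to b_{n-1}, b_{n-2}, b_{n-3}.
    """
    if n < 7:
        raise ValueError("n must be at least 7 so that the lags do not collide")
    b = [0.0] * n
    for k in range(-3, 4):
        b[k % n] = sum(r(t / tau) * r((t + k * tau) / tau)
                       for t in range(-2 * tau + 1, 2 * tau + 2))
    return b


def circulant_inverse(b):
    """First row of A = B^{-1} for a circulant B, via a direct O(n^2) DFT."""
    n = len(b)
    w = [cmath.exp(-2j * cmath.pi * m / n) for m in range(n)]
    spectrum = [sum(b[k] * w[(m * k) % n] for k in range(n)) for m in range(n)]
    return [sum(w[(-m * k) % n] / spectrum[m] for m in range(n)).real / n
            for k in range(n)]


b = autocorr_coeffs(16, 2)
a = circulant_inverse(b)
print([round(v, 4) for v in b[:4]])
print([round(v, 4) for v in a[:6]])
```

Because B is cyclic and symmetric, its inverse is cyclic and symmetric as well, which is why a single row of coefficients describes A completely.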
2. A JPEG decoder for decoding a one-dimensional (1-D) image signal x(t), the image signal being a periodic signal with period N=nτ, where n is an integer and τ is a fixed positive integer, comprising:
means for defining a 1-D cubic-spline filter r(t) by
$r(t)=\begin{cases}(3/2)|t|^{3}-(5/2)|t|^{2}+1, & 0\le |t|<1\\ -(1/2)|t|^{3}+(5/2)|t|^{2}-4|t|+2, & 1\le |t|<2\\ 0, & 2\le |t|;\end{cases}\qquad(1)$
means for applying the filter to the image signal x(t) with
$y_{j}=\sum_{t=-2\tau+1}^{2\tau+1} r(t)\,x(t+j\tau),\qquad 0\le j\le n-1,\qquad(3)$
to compute $y_j$, where $y_j$ is an n-point circular correlation of filter r(t) and image signal x(t);
means for computing a cyclic matrix of size n×n, $B=[b_0,b_1,\ldots,b_{n-1}]_C$, where matrix coefficients $b_k$ are the autocorrelation coefficients for 0≤k≤n−1 between two filters r(t) and r(t+kτ),
$b_{k}=\sum_{t=-2\tau+1}^{2\tau+1} r(t)\,r(t+k\tau),\qquad 0\le k\le n-1,\qquad(4)$
and where for τ=2, $b_0=\alpha=1.641$, $b_1=b_{n-1}=\beta=0.246$, $b_2=b_{n-2}=\gamma=-0.07$, $b_3=b_{n-3}=\delta=0.004$, $b_4=0$, $b_5=0$, . . . , $b_{n-4}=0$;
means for computing a reconstructed filter by computing the inverse matrix of B, $A=B^{-1}$, where the reconstructed filter A is a circular matrix of size n×n, where
$A=[a_0,a_1,a_2,a_3,a_4,a_5,a_6,\ldots,a_{n-6},a_{n-5},a_{n-4},a_{n-3},a_{n-2},a_{n-1}]_C\qquad(6)$
and where the coefficients of the reconstructed filter A are $a_0=0.646$, $a_1=a_{n-1}=-0.109$, $a_2=a_{n-2}=0.0467$, $a_3=a_{n-3}=-0.014$, $a_4=a_{n-4}=0.0046$, $a_5=a_{n-5}=-0.00148$, and $a_6\cong a_7\cong a_8\cong\ldots\cong a_{n-6}\cong 0$; and
means for computing
$X=B^{-1}Y=AY\qquad(7)$
where $Y=[y_0,y_1,\ldots,y_{n-1}]^{T}$ and $X=[x_0,x_1,\ldots,x_{n-1}]^{T}$ are column vectors, each being the transpose of the corresponding row vector, by computing a circular convolution:
$x_{j}=\sum_{k=0}^{n-1} y_{k}\,a_{(j-k)_{n}},\qquad(8)$
where $x_j$ are convolution coefficients between time samples $y_k$ and $a_k$ for 0≤k,j≤n−1 and are the encoded values of the image signal x(t) at sampling points, and where $(j-k)_n$ denotes the residue of (j−k) modulo n.
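The circular convolution of eq. (8) can be carried out directly, without forming the n×n matrix, because only a few coefficients of A are treated as nonzero. The sketch below is a minimal illustration using the rounded coefficient values recited in the claims; the helper name `circular_filter`, the tap dictionaries and the sample vector are hypothetical. Applying the rounded taps of B to the result should return the input only approximately, since both sets of coefficients are rounded and A is truncated after a₅.

```python
# Rounded coefficients as recited in the claims (symmetric: index d also stands for n - d).
A_TAPS = {0: 0.646, 1: -0.109, 2: 0.0467, 3: -0.014, 4: 0.0046, 5: -0.00148}
B_TAPS = {0: 1.641, 1: 0.246, 2: -0.07, 3: 0.004}


def circular_filter(v, taps):
    """Circular convolution of v with a symmetric filter, in the form of eq. (8).

    taps maps an offset d >= 0 to the coefficient shared by offsets d and n - d.
    """
    n = len(v)
    if n <= 2 * max(taps):
        raise ValueError("signal too short for the filter support")
    out = []
    for j in range(n):
        acc = taps[0] * v[j]
        for d, c in taps.items():
            if d:
                # Indices are reduced modulo n, matching (j - k)_n in eq. (8).
                acc += c * (v[(j - d) % n] + v[(j + d) % n])
        out.append(acc)
    return out


# Hypothetical decimated samples y_0 .. y_15 (8-bit range).
y = [12, 40, 200, 180, 90, 60, 255, 0, 33, 70, 128, 140, 150, 20, 10, 5]
x = circular_filter(y, A_TAPS)        # X = A Y, eq. (7)
y_back = circular_filter(x, B_TAPS)   # B X, which should be close to Y
print(max(abs(p - q) for p, q in zip(y, y_back)))
```

Each output sample needs one multiplication for a₀ and one per symmetric pair of taps, so the cost per sample is fixed and independent of n, which is what makes a pipelined implementation straightforward.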
3. A digital signal processor (DSP) having stored thereon a set of instructions including instructions for processing a one-dimensional (1-D) image signal x(t), the image signal being a periodic signal with period N=nτ, where n is an integer and τ is a fixed positive integer, the instructions, when executed by the DSP, perform the steps of:
defining a 1-D cubic-spline filter r(t) by
$r(t)=\begin{cases}(3/2)|t|^{3}-(5/2)|t|^{2}+1, & 0\le |t|<1\\ -(1/2)|t|^{3}+(5/2)|t|^{2}-4|t|+2, & 1\le |t|<2\\ 0, & 2\le |t|;\end{cases}\qquad(1)$
applying the filter to the image signal x(t) with
$y_{j}=\sum_{t=-2\tau+1}^{2\tau+1} r(t)\,x(t+j\tau),\qquad 0\le j\le n-1,\qquad(3)$
to compute $y_j$, where $y_j$ is an n-point circular correlation of filter r(t) and image signal x(t);
computing a cyclic matrix of size n×n, $B=[b_0,b_1,\ldots,b_{n-1}]_C$, where matrix coefficients $b_k$ are the autocorrelation coefficients for 0≤k≤n−1 between two filters r(t) and r(t+kτ),
$b_{k}=\sum_{t=-2\tau+1}^{2\tau+1} r(t)\,r(t+k\tau),\qquad 0\le k\le n-1,\qquad(4)$
and where $b_0=\alpha=1.641$, $b_1=b_{n-1}=\beta=0.246$, $b_2=b_{n-2}=\gamma=-0.07$, $b_3=b_{n-3}=\delta=0.004$, $b_4=0$, $b_5=0$, . . . , $b_{n-4}=0$;
computing a reconstructed filter by computing the inverse matrix of B, $A=B^{-1}$, where the reconstructed filter A is a circular matrix of size n×n, where
$A=[a_0,a_1,a_2,a_3,a_4,a_5,a_6,\ldots,a_{n-6},a_{n-5},a_{n-4},a_{n-3},a_{n-2},a_{n-1}]_C\qquad(6)$
and where the coefficients of the reconstructed filter A are $a_0=0.646$, $a_1=a_{n-1}=-0.109$, $a_2=a_{n-2}=0.0467$, $a_3=a_{n-3}=-0.014$, $a_4=a_{n-4}=0.0046$, $a_5=a_{n-5}=-0.00148$, and $a_6\cong a_7\cong a_8\cong\ldots\cong a_{n-6}\cong 0$; and
computing
$X=B^{-1}Y=AY\qquad(7)$
where $Y=[y_0,y_1,\ldots,y_{n-1}]^{T}$ and $X=[x_0,x_1,\ldots,x_{n-1}]^{T}$ are column vectors, each being the transpose of the corresponding row vector, by computing a circular convolution:
$x_{j}=\sum_{k=0}^{n-1} y_{k}\,a_{(j-k)_{n}},\qquad(8)$
where $x_j$ are convolution coefficients between time samples $y_k$ and $a_k$ for 0≤k,j≤n−1 and are the encoded values of the image signal x(t) at sampling points, and where $(j-k)_n$ denotes the residue of (j−k) modulo n.